The Steady-State Control Problem for Markov Decision Processes
Authors
Abstract
This paper addresses a control problem for probabilistic models in the setting of Markov decision processes (MDP). We are interested in the steady-state control problem, which asks, given an ergodic MDP M and a distribution δgoal, whether there exists a (history-dependent randomized) policy π ensuring that the steady-state distribution of M under π is exactly δgoal. We first show that stationary randomized policies suffice to achieve a given steady-state distribution. We then infer that the steady-state control problem is decidable for MDP and can be expressed as a linear program, solvable in PTIME. This decidability result extends to labeled MDP (LMDP), where the objective is a steady-state distribution on the labels carried by the states, and we provide a PSPACE algorithm. We also show that a related steady-state language inclusion problem is decidable in EXPTIME for LMDP. Finally, we prove that the steady-state control problem becomes undecidable for MDP under partial observation (POMDP).
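The linear-programming characterization mentioned in the abstract can be sketched as a feasibility LP over state-action occupation variables x(s, a) = δgoal(s)·π(a | s): one constraint family ties x to δgoal, another enforces that δgoal is invariant under the induced chain. The sketch below, using SciPy, is illustrative only; the 2-state, 2-action MDP and its transition numbers are made up for the example and are not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-state, 2-action ergodic MDP (illustrative numbers).
# P[a][s, s'] = probability of moving from state s to s' under action a.
P = np.array([
    [[0.9, 0.1],    # action 0
     [0.1, 0.9]],
    [[0.5, 0.5],    # action 1
     [0.5, 0.5]],
])
n_actions, n_states, _ = P.shape
delta_goal = np.array([0.5, 0.5])  # target steady-state distribution

# Feasibility LP over x[s, a] = delta_goal[s] * pi(a | s):
#   (1) sum_a x[s, a] = delta_goal[s]                 (consistency with delta_goal)
#   (2) sum_{s', a} x[s', a] * P[a][s', s] = delta_goal[s]  (invariance of delta_goal)
n_vars = n_states * n_actions  # x flattened as x[s * n_actions + a]
A_eq, b_eq = [], []
for s in range(n_states):
    row = np.zeros(n_vars)
    row[s * n_actions:(s + 1) * n_actions] = 1.0
    A_eq.append(row); b_eq.append(delta_goal[s])
for s in range(n_states):
    row = np.zeros(n_vars)
    for sp in range(n_states):
        for a in range(n_actions):
            row[sp * n_actions + a] = P[a][sp, s]
    A_eq.append(row); b_eq.append(delta_goal[s])

res = linprog(c=np.zeros(n_vars), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=(0, None), method="highs")
if res.success:
    x = res.x.reshape(n_states, n_actions)
    pi = x / delta_goal[:, None]  # stationary randomized policy pi(a | s)
    print("feasible, policy:\n", pi)
else:
    print("no stationary policy achieves delta_goal")
```

A zero objective makes this a pure feasibility check, matching the decision problem: δgoal is achievable by some policy iff the LP has a solution, and by the paper's first result it then suffices to read off a stationary randomized policy from x.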
Publication date: 2013